Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAS-2236 - Updates to methods that calculate the points used to create the dimensions #18

Open
wants to merge 16 commits into
base: DAS-2208-SMAP-L3
Choose a base branch
from

Conversation

sudha-murthy
Copy link
Collaborator

Description

  • functions added to be able to do end toe nd test
  • updated functions that calculated the two sets lat/lon points to calculate the dimension arrays
  • A couple of unit tests need to be added
  • This does not support any test for ordering, multiple grids or 3D support

Jira Issue ID

DAS-2236

Local Test Steps

  • Make sure the unit tests pass
  • if you can test with harmony in a box with mask fill disabled - you can test the following url

http://localhost:3000/C1268452378-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&granuleId=G1268452388-EEDTEST&subset=lat(70%3A80)&subset=lon(-160%3A-150)&variable=Soil_Moisture_Retrieval_Data_AM%2Fstatic_water_body_fraction&skipPreview=true
and verify you can check the outputs in panoply

PR Acceptance Checklist

  • [X ] Jira ticket acceptance criteria met.
  • CHANGELOG.md updated to include high level summary of PR changes.
  • docker/service_version.txt updated if publishing a release.
  • [X ] Tests added/updated and passing.
  • Documentation updated (if needed).

hoss/coordinate_utilities.py Show resolved Hide resolved
hoss/coordinate_utilities.py Show resolved Hide resolved
hoss/coordinate_utilities.py Show resolved Hide resolved
hoss/coordinate_utilities.py Show resolved Hide resolved
hoss/coordinate_utilities.py Outdated Show resolved Hide resolved
hoss/coordinate_utilities.py Outdated Show resolved Hide resolved
hoss/coordinate_utilities.py Show resolved Hide resolved
hoss/coordinate_utilities.py Outdated Show resolved Hide resolved
hoss/coordinate_utilities.py Outdated Show resolved Hide resolved
hoss/coordinate_utilities.py Outdated Show resolved Hide resolved
hoss/dimension_utilities.py Outdated Show resolved Hide resolved
hoss/dimension_utilities.py Show resolved Hide resolved
hoss/dimension_utilities.py Show resolved Hide resolved
Copy link
Member

@owenlittlejohns owenlittlejohns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look, and I appreciate this PR being another more bite-sized piece of progress towards the end goal.

I think there are few themes to my comments:

  • You haven't added sufficient unit tests to provide thorough coverage of your changes. In most cases, you've got some tests for new functions, but not all. The tests you've added are generally good, but often are only testing the happy path. It's pretty important to also test realistic edge-cases, which isn't really done in this PR.
  • There are quite a lot of places where you are performing very similar tasks, but in some cases along a row, rather than up-and-down a column. Some of those tasks are basically the same thing (performing some operation on a 1-D slice of an array), so it would be cleaner to write a function to do something to a 1-D slice and then invoke it with the correct 1-D slice for either a row or a column.
  • I think generally, there are a few places where the documentation strings for each function could do with a bit more descriptive text. This will be the number one location that future developers try to read to understand what the code is meant to do.
  • General coding comments.

@@ -3,16 +3,16 @@
coordinate variable data to projected x/y dimension values
"""

from typing import Dict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think the better way to go is just to use the Python built-in dict, much as you already do for list and tuple. That means you don't need to import typing.Dict here. (When we don't have a big feature on the boil for HOSS, I'm going to go through and clean up the other modules that currently use typing.Dict, typing.List, etc, for consistency)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. - 1c29221

@@ -101,20 +101,23 @@ def get_coordinate_variables(
CF-Convention coordinates metadata attribute. It returns them in a specific
order [latitude, longitude]"
"""

coordinate_variables_list = varinfo.get_references_for_attribute(
# varinfo returns a set and not a list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need this comment? It seems a bit superfluous.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - 1c29221

@@ -101,20 +101,23 @@ def get_coordinate_variables(
CF-Convention coordinates metadata attribute. It returns them in a specific
order [latitude, longitude]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is a comment on the previously written documentation string, but this description strongly implies the return of this function is a list containing a single latitude and longitude variable. Instead it returns a list with two sub-lists (one each of latitude and longitude variable names).

Also,

This function returns latitude and longitude variables...

probably would be clearer as:

This function returns latitude and longitude variable names...

(E.g., it specifically returns the names of the variables, not the variables themselves)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - 1c29221

for coordinate in coordinate_variables_list
if varinfo.get_variable(coordinate).is_latitude()
for coordinate in coordinate_variables
if varinfo.get_variable(coordinate) is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same comment for the similar change in the list comprehension for longitude_coordinate_variables)

Where do you catch any issues if there is a reference to a variable in a coordinates metadata attribute that isn't present in the file? If it's pointing to a latitude or longitude variable that isn't present in the file, at some point that's going to cause a failure when you try to access that variable. Is that gracefully handled somewhere?

(It's likely this is already nicely supported, but it's been a while since the previous PR and I just can't remember)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate comment - you have not added unit tests to prove this case is being handled as expected. (Needed both for latitude or longitude)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • this method is first called before prefetch. it should not fail if the variables don't exist.
  • if it is present in the variable and not in the file - it will fail during prefetch. That is when it is first accessed from the file. - will check if I have a test on that
  • There is a unit test for the function - will add subtests if the variables don't exist

Copy link
Collaborator Author

@sudha-murthy sudha-murthy Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hoss/coordinate_utilities.py Outdated Show resolved Hide resolved
Comment on lines 270 to 272
x_y_points = list(zip(points_x, points_y))

return x_y_points
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can avoid declaring a variable for a single immediate use and just do:

return list(zip(points_x, points))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. - 1c29221

Comment on lines 257 to 259
points: list[tuple], # list of data points as tuple in (lon, lat) order
crs: CRS, # CRS object from PyProj
) -> list[tuple]: # list of X-Y points as tuple in (X, Y) order
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another place where the in-line comments should go in the documentation string.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - 2f07d1a

"""Ensure that two valid lat/lon points returned by the method
with a set of lat/lon coordinates as input

"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests. I think there are a couple of additional missing test-cases, though:

  • There is only a single valid pixel.
  • There are no valid pixels.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly also what happens when there are only valid points in a single row or single column (for geographic things).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated unit tests - a3680c5

returned by the method with a set of lat/lon coordinates as input

"""
with self.subTest('Get two sets of valid indices from coordinates dataset'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comments to a couple of other places. This unit test is only really testing the happy-path. You aren't really throwing any potential edge-cases at the function. Other test cases are:

  • Only a single valid pixel.
  • Only valid pixels in a single row.
  • Only valid pixels in a single column.
  • No valid pixels.

These seem like overkill, but it's really important to know how this code behaves in realistic edge-cases, so we know failure is graceful.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated unit tests - 57744a6

@@ -328,7 +328,7 @@ def test_get_fill_slice(self):

@patch('hoss.dimension_utilities.add_bounds_variables')
@patch('hoss.dimension_utilities.get_opendap_nc4')
def test_prefetch_dimension_variables(
def test_get_prefetch_variables(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that you've added conditional logic to the get_prefetch_variables function, you need to add more test-cases for it. The get_prefetch_variables function will need a subtest (or even separate tests if you prefer) that runs through each branch of code.

Right now, I think this test just ensures variables with dimensions, but not bounds, behave nicely. There's also test_prefetch_dimensions_with_bounds lower down in the file that ensures variables with dimensions and bounds work well. But there's nothing invoking prefetch_variables for requested variables with anonymous dimensions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - 3b0a149

@D-Auty
Copy link

D-Auty commented Nov 22, 2024

I'll check in after Owen has had a chance to review. I've reviewed my suggestions and updates made are ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants